-
Bos, Joppe W; Celi, Sofia; Kannwischer, Matthias J (Eds.)
Retrieval Augmented Generation (RAG) can enhance the performance of Large Language Models (LLMs) when used in conjunction with a comprehensive knowledge database. However, the space required to store the necessary information can be prohibitive when RAG is run locally. This has given rise to RAG-as-a-Service (RaaS), in which third-party servers process client queries against an external database. Unfortunately, such a service exposes the client's query to a third party, making it unsuitable for sensitive queries. Our scheme ensures that throughout RAG processing, neither the query, the computed distances, nor any retrieval information is revealed to the server hosting the database. Using a two-pronged approach, we employ Fully Homomorphic Encryption (FHE) and Private Information Retrieval (PIR) to guarantee security during the entire process. FHE preserves privacy during initial query processing: the query embedding is encrypted and sent to the server for k-means centroid scoring to obtain a similarity ranking. A series of PIR queries then privately retrieves the centroid-associated embeddings and the top-ranked documents. The resulting protocol, RAGtime-PIANO, is a first-of-its-kind lightweight protocol for efficient, fully secure RAG.
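The two-phase flow described in this abstract can be sketched in plaintext. Note that the FHE encryption of the query embedding and the PIR machinery are deliberately elided here; this only illustrates the centroid-scoring and top-k retrieval structure the protocol performs under encryption, and all names are illustrative rather than taken from the paper.

```python
# Plaintext sketch of the two-phase retrieval flow. In the actual
# protocol the query embedding is FHE-encrypted before scoring, and
# each fetch below would be a PIR query; both are elided here.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def score_centroids(query_emb, centroids):
    """Phase 1: rank k-means centroids by similarity to the query.
    Under FHE the server computes these scores on ciphertexts."""
    scores = [(dot(query_emb, c), i) for i, c in enumerate(centroids)]
    return [i for _, i in sorted(scores, reverse=True)]

def retrieve(ranking, cluster_docs, top_k=2):
    """Phase 2: fetch documents for the best-ranked clusters.
    In the protocol each fetch is a PIR query, so the server never
    learns which clusters or documents were selected."""
    docs = []
    for idx in ranking:
        docs.extend(cluster_docs[idx])
        if len(docs) >= top_k:
            break
    return docs[:top_k]

centroids = [[1.0, 0.0], [0.0, 1.0]]
cluster_docs = {0: ["doc-a", "doc-b"], 1: ["doc-c"]}
ranking = score_centroids([0.9, 0.1], centroids)
print(retrieve(ranking, cluster_docs))  # ['doc-a', 'doc-b']
```

The key structural point is that the server only ever operates on ciphertexts in phase 1 and answers oblivious index queries in phase 2, so neither phase leaks the ranking to the host.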
-
Dietterich, Thomas (Ed.)
Existing retrieval-augmented generation (RAG) systems typically use a centralized architecture, causing high costs for data collection, integration, and management, as well as privacy concerns. There is a great need for a decentralized RAG system that enables foundation models to use information directly from data owners who retain full control over their sources. However, decentralization brings a challenge: the numerous independent data sources vary significantly in reliability, which can diminish retrieval accuracy and response quality. To address this, our decentralized RAG system includes a novel reliability scoring mechanism that dynamically evaluates each source based on the quality of the responses it helps generate and prioritizes high-quality sources during retrieval. To ensure transparency and trust, the scoring process is securely managed through blockchain-based smart contracts, creating verifiable and tamper-proof reliability records without relying on a central authority. We evaluate our decentralized system with two Llama models (3B and 8B) in two simulated environments where six data sources have different levels of reliability. Our system achieves a +10.7% performance improvement over its centralized counterpart in realistic unreliable-data environments. Notably, it approaches the upper-bound performance of centralized systems under ideally reliable data conditions. The decentralized infrastructure enables secure and trustworthy score management, achieving approximately 56% marginal cost savings through batched update operations. Our code and system are open-sourced at this http URL.
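A dynamic source-reliability score of the kind described above can be sketched minimally as an exponential moving average over per-response quality in [0, 1]. The paper's exact scoring rule and its on-chain bookkeeping are not reproduced here; all names and the update rule are illustrative assumptions.

```python
# Minimal sketch: per-source reliability as an exponential moving
# average of observed response quality, used to order retrieval.

class SourceScore:
    def __init__(self, alpha=0.3, initial=0.5):
        self.alpha = alpha      # weight given to the newest observation
        self.score = initial    # neutral prior before any feedback

    def update(self, quality):
        """Blend the newest response quality into the running score."""
        self.score = self.alpha * quality + (1 - self.alpha) * self.score
        return self.score

def prioritize(sources):
    """Order sources so retrieval prefers high-reliability ones."""
    return sorted(sources, key=lambda s: sources[s].score, reverse=True)

scores = {"src-a": SourceScore(), "src-b": SourceScore()}
scores["src-a"].update(1.0)   # consistently helpful source
scores["src-b"].update(0.0)   # unreliable source
print(prioritize(scores))     # ['src-a', 'src-b']
```

Batching many such updates into a single on-chain transaction, rather than writing each score change individually, is the kind of design that yields the marginal cost savings the abstract reports.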
-
Computational methods and machine learning (ML) are reshaping materials science by accelerating the discovery, design, and optimization of materials. Traditional approaches such as density functional theory and molecular dynamics have been instrumental in studying materials at the atomic level. However, their high computational cost and, in certain cases, limited accuracy can restrict the scope of in silico exploration. ML promises to accelerate material property prediction and design, but in many areas the volume and fidelity of the data are critical barriers. Active learning can reduce the reliance on large data sets, and simulation has emerged as a critical tool for generating data on the fly. Despite these advances, challenges remain, particularly in data quality, model interpretability, and bridging the gap between computational predictions and experimental validation. Future research should develop automated frameworks capable of designing and testing materials for specific applications; integrating ML with traditional simulations and experiments can contribute to this goal.
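The active-learning idea mentioned above can be sketched as a toy selection loop: pick the unlabeled candidates an ensemble of surrogate models disagrees on most, since those are where a new simulation or experiment is most informative. The stand-in models and the disagreement criterion here are illustrative assumptions, not a method from the review.

```python
# Toy uncertainty-sampling sketch: rank candidate inputs by the
# variance of an ensemble's predictions and label the top ones next.
import statistics

def ensemble_predict(x, models):
    return [m(x) for m in models]

def most_uncertain(pool, models, n=1):
    """Rank candidates by ensemble disagreement (prediction variance)."""
    by_var = sorted(
        pool,
        key=lambda x: statistics.pvariance(ensemble_predict(x, models)),
        reverse=True,
    )
    return by_var[:n]

# Two stand-in surrogate models that disagree increasingly far from 0.
models = [lambda x: x * 1.0, lambda x: x * -1.0]
pool = [0.1, 2.0, 0.5]
print(most_uncertain(pool, models, n=1))  # [2.0]
```

In a materials workflow, labeling a selected candidate would mean running the expensive simulation (e.g. a DFT calculation) only for that point, which is how active learning reduces reliance on large pre-built data sets.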
-
Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the complexity of real-world multi-chart scenarios. Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs’ capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. Our evaluation of a wide range of MLLMs reveals significant performance gaps compared to humans. These results highlight the challenges in multi-chart comprehension and the potential of MultiChartQA to drive advancements in this field. Our code and data are available at https://github.com/Zivenzhu/Multi-chart-QA.
-
Generative models such as Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) trained on massive web corpora can memorize and disclose individuals’ confidential and private data, raising legal and ethical concerns. While many previous works have addressed this issue in LLMs via machine unlearning, it remains largely unexplored for MLLMs. To tackle this challenge, we introduce the Multimodal Large Language Model Unlearning Benchmark (MLLMU-Bench), a novel benchmark aimed at advancing the understanding of multimodal machine unlearning. MLLMU-Bench consists of 500 fictitious profiles and 153 profiles of public celebrities; each profile features over 14 customized question-answer pairs, evaluated from both multimodal (image+text) and unimodal (text) perspectives. The benchmark is divided into four sets to assess unlearning algorithms in terms of efficacy, generalizability, and model utility. Finally, we provide baseline results using existing generative-model unlearning algorithms. Surprisingly, our experiments show that unimodal unlearning algorithms excel in generation tasks, while multimodal unlearning approaches perform better in classification with multimodal inputs.
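The efficacy-versus-utility evaluation implied above can be sketched as scoring a model against a forget set of profiles (which it should now answer incorrectly) and a retain set (which it should still answer correctly), in either modality. The field names, accuracy metric, and toy model here are illustrative assumptions, not the benchmark's actual schema.

```python
# Sketch of a forget/retain evaluation: low accuracy on forgotten
# profiles indicates unlearning efficacy; high accuracy on retained
# profiles indicates preserved model utility.
from dataclasses import dataclass, field

@dataclass
class Profile:
    name: str
    qa_pairs: list = field(default_factory=list)  # (question, answer, has_image)

def accuracy(model, profiles, multimodal):
    """Fraction of QA pairs answered correctly in one modality."""
    pairs = [(q, a) for p in profiles
             for q, a, img in p.qa_pairs if img == multimodal]
    if not pairs:
        return 0.0
    return sum(model(q) == a for q, a in pairs) / len(pairs)

forget = [Profile("fictitious-001", [("Q1", "A1", True)])]
retain = [Profile("celebrity-001", [("Q2", "A2", True)])]
model = lambda q: {"Q2": "A2"}.get(q, "?")   # forgot Q1, retained Q2
print(accuracy(model, forget, True), accuracy(model, retain, True))  # 0.0 1.0
```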